ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data
Developing a generalized segmentation model capable of simultaneously
delineating multiple organs and diseases is highly desirable. Federated
learning (FL) is a key technology enabling the collaborative development of a
model without exchanging training data. However, the limited access to fully
annotated training data poses a major challenge to training generalizable
models. We propose "ConDistFL", a framework that addresses this problem by
combining FL with knowledge distillation. Through an adequately designed
conditional probability representation, local models trained on partially
annotated data can distill knowledge of locally unlabeled organs and tumors
from the global model. We validate
our framework on four distinct partially annotated abdominal CT datasets from
the MSD and KiTS19 challenges. The experimental results show that the proposed
framework significantly outperforms FedAvg and FedOpt baselines. Moreover, the
performance on an external test dataset demonstrates superior generalizability
compared to models trained on each dataset separately. Our ablation study
suggests that ConDistFL can perform well without frequent aggregation, reducing
the communication cost of FL. Our implementation will be available at
https://github.com/NVIDIA/NVFlare/tree/dev/research/condist-fl
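The core idea, distilling the global model's predictions for locally unlabeled classes via a renormalized (conditional) probability distribution, can be sketched as follows. This is a toy NumPy illustration under assumed simplifications (per-voxel logits, a boolean mask of locally annotated classes, a plain KL distillation term); the function names and the exact loss are illustrative, not the paper's formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def condist_loss(student_logits, teacher_logits, labeled_mask, temperature=2.0):
    """Toy conditional-distillation loss (illustrative, not ConDistFL's exact loss).

    student_logits, teacher_logits: (num_voxels, num_classes) arrays.
    labeled_mask: boolean (num_classes,), True for classes annotated locally.
    Only the locally *unlabeled* classes are distilled from the global
    (teacher) model, after renormalizing over those classes so the target
    forms a valid conditional distribution.
    """
    s = softmax(student_logits / temperature)
    t = softmax(teacher_logits / temperature)
    unlabeled = ~labeled_mask
    # Renormalize over the unlabeled classes -> conditional distributions.
    s_cond = s[:, unlabeled] / s[:, unlabeled].sum(axis=1, keepdims=True)
    t_cond = t[:, unlabeled] / t[:, unlabeled].sum(axis=1, keepdims=True)
    # KL(teacher || student), averaged over voxels.
    eps = 1e-8
    kl = np.sum(t_cond * (np.log(t_cond + eps) - np.log(s_cond + eps)), axis=1)
    return float(np.mean(kl))
```

In this sketch the supervised loss on locally labeled classes would be added separately; the distillation term only constrains the classes the local annotations say nothing about.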
Automated Pancreas Segmentation Using Multi-institutional Collaborative Deep Learning
The performance of deep learning-based methods depends strongly on the amount
of training data. Many efforts have been made to increase the data available
in the medical image analysis field. However, unlike everyday photographs,
medical images are hard to collect into centralized databases because of
numerous technical, legal, and privacy issues. In this work, we study the use
of federated learning between two institutions in a real-world setting to
collaboratively train a model without sharing the raw data across national
boundaries. We quantitatively compare the segmentation models obtained with
federated learning and local training alone. Our experimental results show that
models trained with federated learning generalize better than models trained
at a single institution.
Comment: Accepted at the MICCAI DCL Workshop 2020.
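The abstract does not name the aggregation algorithm, but the standard baseline for this kind of two-institution collaboration is FedAvg-style weighted averaging of the site models. A minimal sketch, assuming each site ships its parameters (here, plain NumPy arrays keyed by name) and its number of training cases; real frameworks add encryption, scheduling, and multiple rounds on top of this step:

```python
import numpy as np

def fed_avg(site_weights, site_sizes):
    """One FedAvg aggregation step: combine per-site model parameters
    into a global model, weighting each site by its training-set size.

    site_weights: list of dicts mapping parameter name -> np.ndarray.
    site_sizes: list of ints, number of training cases per site.
    """
    total = sum(site_sizes)
    global_weights = {}
    for name in site_weights[0]:
        # Size-weighted average of this parameter across sites.
        global_weights[name] = sum(
            (n / total) * w[name] for w, n in zip(site_weights, site_sizes)
        )
    return global_weights
```

Only these parameter arrays cross institutional (and national) boundaries; the raw CT volumes never leave each site, which is the privacy property the study relies on.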
Detection of pancreatic cancer with two- and three-dimensional radiomic analysis in a nationwide population-based real-world dataset
Background: CT is the major detection tool for pancreatic cancer (PC). However, approximately 40% of PCs < 2 cm are missed on CT, underscoring a pressing need for tools to supplement radiologist interpretation.
Methods: Contrast-enhanced CT studies of 546 patients with pancreatic adenocarcinoma diagnosed by histology/cytology between January 2005 and December 2019, and 733 CT studies of controls with a normal pancreas obtained during the same period at a tertiary referral center, were retrospectively collected to develop an automatic end-to-end computer-aided detection (CAD) tool for PC using two-dimensional (2D) and three-dimensional (3D) radiomic analysis with machine learning. The CAD tool was tested on a nationwide dataset comprising 1,477 CT studies (671 PCs, 806 controls) obtained from institutions throughout Taiwan.
Results: The CAD tool achieved 0.918 (95% CI, 0.895–0.938) sensitivity and 0.822 (95% CI, 0.794–0.848) specificity in differentiating between studies with and without PC (area under the curve 0.947, 95% CI, 0.936–0.958), with 0.707 (95% CI, 0.602–0.797) sensitivity for tumors < 2 cm. The positive and negative likelihood ratios for PC were 5.17 (95% CI, 4.45–6.01) and 0.10 (95% CI, 0.08–0.13), respectively. Where high specificity is needed, using the 2D and 3D analyses in series yielded 0.952 (95% CI, 0.934–0.965) specificity with a sensitivity of 0.742 (95% CI, 0.707–0.775), whereas using the 2D and 3D analyses in parallel to maximize sensitivity yielded 0.915 (95% CI, 0.891–0.935) sensitivity at a specificity of 0.791 (95% CI, 0.762–0.819).
Conclusions: The high accuracy and robustness of the CAD tool support its potential for enhancing the detection of PC.
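The series/parallel trade-off reported above is a standard way of composing two binary tests: requiring both analyses to agree raises specificity, while accepting either positive raises sensitivity. A small sketch of that composition, plus the textbook likelihood-ratio formulas the abstract's 5.17 and 0.10 follow from (function names are ours, not the paper's):

```python
def combine_2d_3d(pred_2d, pred_3d, mode="series"):
    """Combine binary outputs of the 2D and 3D analyses (illustrative sketch).

    'series'  : flag PC only if BOTH analyses are positive -> favors specificity.
    'parallel': flag PC if EITHER analysis is positive     -> favors sensitivity.
    """
    if mode == "series":
        return [a and b for a, b in zip(pred_2d, pred_3d)]
    return [a or b for a, b in zip(pred_2d, pred_3d)]

def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg
```

Plugging in the reported sensitivity 0.918 and specificity 0.822 reproduces (up to rounding) the stated LR+ of 5.17 and LR- of 0.10.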
Federated learning for predicting clinical outcomes in patients with COVID-19
Federated learning (FL) is a method for training artificial intelligence models with data from multiple sources while maintaining data anonymity, thus removing many barriers to data sharing. Here we used data from 20 institutes across the globe to train an FL model, called EXAM (electronic medical record (EMR) chest X-ray AI model), that predicts the future oxygen requirements of symptomatic patients with COVID-19 using inputs of vital signs, laboratory data and chest X-rays. EXAM achieved an average area under the curve (AUC) > 0.92 for predicting outcomes at 24 and 72 h from the time of initial presentation to the emergency room, and it provided a 16% improvement in average AUC measured across all participating sites and an average increase in generalizability of 38% compared with models trained at a single site using that site's data. For prediction of mechanical ventilation treatment or death at 24 h at the largest independent test site, EXAM achieved a sensitivity of 0.950 and a specificity of 0.882. In this study, FL facilitated rapid data science collaboration without data exchange and generated a model that generalized across heterogeneous, unharmonized datasets for prediction of clinical outcomes in patients with COVID-19, setting the stage for the broader use of FL in healthcare.
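EXAM itself is a learned deep multimodal model, but the fusion idea it rests on, combining EMR-derived features (vitals, labs) with chest X-ray-derived image features in a single risk head, can be loosely illustrated as below. All weights and feature names here are placeholders of our own, not EXAM's architecture or parameters:

```python
import numpy as np

def fused_risk(emr_features, cxr_features, weights, bias=0.0):
    """Toy late-fusion risk score: concatenate EMR-derived features with
    CXR-derived image features, then apply a logistic output head.
    Placeholder weights only; EXAM learns its fusion end to end."""
    x = np.concatenate([emr_features, cxr_features])
    z = float(x @ weights + bias)
    return 1.0 / (1.0 + np.exp(-z))  # predicted risk in (0, 1)
```

In the federated setting, the parameters of such a model (not the patient records or images) are what each of the 20 sites would exchange with the aggregation server.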